Convexification and Deconvexification for Training Neural Networks
Authors
Abstract
This paper presents a new method of training neural networks, including deep learning machines, based on the idea of convexifying the training error criterion with the risk-averting error (RAE) criterion. Convexification creates tunnels between the depressed regions around saddle points, tilts the plateaus, and eliminates non-global local minima. The difficulties in computing the RAE and its gradient and in selecting the value of its risk-sensitivity index λ are eliminated with the normalized RAE (NRAE). The new method, called gradual deconvexification (GDC), starts with the NRAE at a very large λ, gradually decreases λ, and switches to the RAE as soon as the RAE becomes computationally manageable. This way, the gradients on the plateaus of the training error criterion are raised effectively but not excessively. Numerical experiments show the effectiveness of GDC compared with unsupervised pretraining, the state of the art in training deep learning machines. After the minimization process is terminated by cross-validation, a statistical pruning method is used to enhance the generalization capability of the resultant neural network. Numerical results show a further reduction of the testing criterion.
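The abstract does not reproduce the RAE/NRAE formulas or the λ schedule, so the minimal sketch below relies on the definitions commonly used in this line of work, RAE_λ(w) = Σ_k exp(λ ε_k(w)) and NRAE_λ(w) = (1/λ) ln((1/K) Σ_k exp(λ ε_k(w))), where ε_k(w) is the squared error on the k-th training sample. Everything else is an illustrative assumption rather than the authors' implementation: the model interface, the finite-difference gradient, the geometric λ schedule (lam0, shrink), and the overflow_guard threshold used to decide when the RAE is "computationally manageable".

```python
import numpy as np


def squared_errors(w, X, Y, model):
    """Per-sample squared errors eps_k(w) = ||y_k - f(x_k; w)||^2."""
    preds = model(X, w)                      # expected shape: (K, d_out)
    return np.sum((Y - preds) ** 2, axis=1)


def rae(eps, lam):
    """Risk-averting error, sum_k exp(lam * eps_k) (assumed definition)."""
    return np.sum(np.exp(lam * eps))


def nrae(eps, lam):
    """Normalized RAE, (1/lam) * ln((1/K) * sum_k exp(lam * eps_k)),
    evaluated with a log-sum-exp shift so that a very large lam does not
    overflow the exponential."""
    shift = lam * np.max(eps)
    return (shift + np.log(np.mean(np.exp(lam * eps - shift)))) / lam


def numerical_grad(f, w, h=1e-6):
    """Central-difference gradient of a scalar function f at the 1-D vector w
    (a stand-in for the analytic gradients used in the paper)."""
    g = np.zeros(w.shape)
    for i in range(w.size):
        e = np.zeros(w.shape)
        e[i] = h
        g[i] = (f(w + e) - f(w - e)) / (2.0 * h)
    return g


def gdc_train(w, X, Y, model, lam0=1e6, shrink=0.5, lr=1e-3,
              steps_per_stage=200, max_stages=20, overflow_guard=500.0):
    """Gradual deconvexification (GDC), sketched: minimize the NRAE starting
    from a very large lambda, shrink lambda stage by stage, and switch to the
    RAE once exp(lam * eps_k) can no longer overflow. The schedule and the
    overflow_guard threshold are hypothetical choices."""
    lam = lam0
    for _ in range(max_stages):
        eps = squared_errors(w, X, Y, model)
        use_rae = lam * np.max(eps) < overflow_guard   # RAE now manageable?
        if use_rae:
            criterion = lambda v: rae(squared_errors(v, X, Y, model), lam)
        else:
            criterion = lambda v: nrae(squared_errors(v, X, Y, model), lam)
        for _ in range(steps_per_stage):               # plain gradient descent
            w = w - lr * numerical_grad(criterion, w)
        if use_rae:                                    # finish on the RAE
            break
        lam *= shrink                                  # gradual deconvexification
    return w
```

A toy usage would be a linear model, model = lambda X, w: X @ w.reshape(X.shape[1], -1), with w a flat parameter vector. The point of the log-sum-exp shift in nrae is that the criterion stays finite even when lam * eps_k is in the thousands, which is what allows the schedule to start from a very large λ before deconvexifying.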
Related papers
Classification of ECG signals using Hermite functions and MLP neural networks
Classification of heart arrhythmia is an important step in developing devices for monitoring the health of individuals. This paper proposes a three-module system for classification of electrocardiogram (ECG) beats. These modules are: a denoising module, a feature extraction module, and a classification module. In the first module the stationary wavelet transform (SWT) is used for noise reduction of ...
Handwritten Character Recognition using Modified Gradient Descent Technique of Neural Networks and Representation of Conjugate Descent for Training Patterns
The purpose of this study is to analyze the performance of the back-propagation algorithm with changing training patterns and a second momentum term in feed-forward neural networks. The analysis is conducted on 250 different words of three lowercase letters from the English alphabet. These words are presented to two vertical segmentation programs, which are designed in MATLAB and based on portions (1...
Estimation of Daily Evaporation Using of Artificial Neural Networks (Case Study; Borujerd Meteorological Station)
Evaporation is one of the most important components of the hydrologic cycle. Accurate estimation of this parameter is used for studies such as water balance, irrigation system design, and water resource management. In order to estimate evaporation, direct measurement methods or physical and empirical models can be used. Using direct methods requires installing meteorological stations and instruments ...
A framework for parallel and distributed training of neural networks
The aim of this paper is to develop a general framework for training neural networks (NNs) in a distributed environment, where training data is partitioned over a set of agents that communicate with each other through a sparse, possibly time-varying, connectivity pattern. In such a distributed scenario, the training problem can be formulated as the (regularized) optimization of a non-convex socia...
PREDICTION OF COMPRESSIVE STRENGTH AND DURABILITY OF HIGH PERFORMANCE CONCRETE BY ARTIFICIAL NEURAL NETWORKS
Neural networks have recently been widely used to model human activities in many areas of civil engineering applications. In the present paper, artificial neural networks (ANNs) for predicting the compressive strength of cubes and the durability of concrete containing metakaolin with fly ash and silica fume with fly ash are developed at the ages of 3, 7, 28, 56, and 90 days. For building these...